Tight Bounds for Local Glivenko-Cantelli
This paper addresses the statistical problem of estimating the infinity-norm
deviation between the empirical mean and the distribution mean for high-dimensional
distributions on , potentially with . Unlike traditional
bounds as in the classical Glivenko-Cantelli theorem, we explore the
instance-dependent convergence behavior. For product distributions, we provide
the exact non-asymptotic behavior of the expected maximum deviation, revealing
various regimes of decay. In particular, these tight bounds demonstrate the
necessity of a previously proposed factor for an upper bound, answering a
corresponding COLT 2023 open problem. We also consider general distributions on
and provide the tightest possible bounds for the maximum deviation
of the empirical mean given only the mean statistic. Along the way, we prove a
localized version of the Dvoretzky-Kiefer-Wolfowitz inequality. Additionally,
we present some results for two other cases, one where the deviation is
measured in some -norm, and the other where the distribution is supported on
a continuous domain , and also provide some high-probability bounds
for the maximum deviation in the independent Bernoulli case. Comment: ALT 202
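The quantity studied in the abstract above, the expected maximum deviation of the empirical mean in the independent Bernoulli case, can be explored numerically. The following is a minimal Monte Carlo sketch and not the paper's method: the function names, the number of trials, and the decaying mean vector `p` are all illustrative assumptions.

```python
import random


def max_deviation(p, n, rng):
    """Sup-norm distance between the empirical mean of n samples and p.

    Each sample is a vector of independent Bernoulli(p[j]) coordinates.
    """
    d = len(p)
    counts = [0] * d
    for _ in range(n):
        for j in range(d):
            if rng.random() < p[j]:
                counts[j] += 1
    return max(abs(counts[j] / n - p[j]) for j in range(d))


def expected_max_deviation(p, n, trials=200, seed=0):
    """Monte Carlo estimate of the expected maximum deviation."""
    rng = random.Random(seed)
    return sum(max_deviation(p, n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Hypothetical instance: means decaying like 1 / (j + 2)^2 (an assumption).
    p = [1.0 / (j + 2) ** 2 for j in range(50)]
    for n in (10, 100, 1000):
        print(n, expected_max_deviation(p, n))
```

Changing how fast the entries of `p` decay is one way to observe that the decay in `n` depends on the instance rather than only on the dimension.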
Probabilistic bounds on the Traveling Salesman Problem and the Traveling Repairman Problem
The traveling salesman problem (-TSP) seeks a tour of minimal length
that visits a subset of points. The traveling repairman problem (TRP)
seeks a complete tour with minimal latency. This paper provides constant-factor
probabilistic approximations of both problems. We first show that the optimal
length of the -TSP path grows at a rate of
. The proof
provides a constant-factor approximation scheme, which solves a TSP in a
high-concentration zone -- leveraging large deviations of local concentrations.
Then, we show that the optimal TRP latency grows at a rate of . This result extends the classical Beardwood-Halton-Hammersley theorem to
the TRP. Again, the proof provides a constant-factor approximation scheme,
which visits zones by decreasing order of probability density. We discuss
practical implications of this result in the design of transportation and
logistics systems. Finally, we propose dedicated notions of fairness --
randomized population-based fairness for the -TSP and geographical fairness
for the TRP -- and give algorithms to balance efficiency and fairness.
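The two objectives in the abstract above differ in what they sum: the TSP objective is the total length travelled, while the TRP objective (latency) is the sum of arrival times at the visited points. A minimal sketch of both, assuming an open path that starts at the first point of the visiting order and Euclidean distances in the plane; the helper names are illustrative and not taken from the paper.

```python
import math


def path_length(points, order):
    """Total Euclidean length of the path visiting points in the given order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def total_latency(points, order):
    """Sum of arrival times at each point after the start (unit speed)."""
    elapsed = 0.0
    latency = 0.0
    for a, b in zip(order, order[1:]):
        elapsed += math.dist(points[a], points[b])  # arrival time at point b
        latency += elapsed
    return latency


if __name__ == "__main__":
    points = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
    for order in ([0, 1, 2], [0, 2, 1]):
        print(order, path_length(points, order), total_latency(points, order))
```

Because early legs are counted once per later arrival, latency rewards reaching many points quickly, which fits the abstract's scheme of visiting zones by decreasing order of probability density.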
Memory-Constrained Algorithms for Convex Optimization via Recursive Cutting-Planes
We propose a family of recursive cutting-plane algorithms to solve
feasibility problems with constrained memory, which can also be used for
first-order convex optimization. Precisely, in order to find a point within a
ball of radius with a separation oracle in dimension -- or to
minimize -Lipschitz convex functions to accuracy over the unit
ball -- our algorithms use
bits of memory, and make
oracle calls, for some universal constant . The family is
parametrized by and provides an oracle-complexity/memory trade-off in
the sub-polynomial regime . While several works
gave lower-bound trade-offs (impossibility results) -- we make explicit here their
dependence on , showing that these also hold in any
sub-polynomial regime -- to the best of our knowledge this is the first class
of algorithms that provides a positive trade-off between gradient descent and
cutting-plane methods in any regime with . The
algorithms divide the variables into blocks and optimize over blocks
sequentially, with approximate separation vectors constructed using a variant
of Vaidya's method. In the regime , our algorithm
with achieves the information-theoretic optimal memory usage and improves
the oracle-complexity of gradient descent.
Additional Results and Extensions for the paper "Probabilistic bounds on the Traveling Salesman Problem and the Traveling Repairman Problem"
This technical report provides additional results for the main paper
"Probabilistic bounds on the Traveling Salesman Problem (TSP) and the
Traveling Repairman Problem (TRP)". For the TSP, we extend the
probabilistic bounds derived in the main paper to the case of distributions
with general densities. For the TRP, we propose a utility-based notion of
fairness and derive constant-factor probabilistic bounds for this objective,
thus extending the TRP bounds from the main paper to non-linear utilities.
On the Length of Monotone Paths in Polyhedra
Motivated by the problem of bounding the number of iterations of the Simplex
algorithm, we investigate the possible lengths of monotone paths followed by the
Simplex method inside the oriented graphs of polyhedra (oriented by the
objective function). We consider both the shortest and the longest monotone
paths and estimate the monotone diameter and height of polyhedra. Our analysis
applies to transportation polytopes, matroid polytopes, matching polytopes,
shortest-path polytopes, and the TSP, among others. We begin by showing that
combinatorial cubes have monotone and Bland pivot height bounded by their
dimension and that in fact all monotone paths of zonotopes are no larger than
the number of edge directions of the zonotope. We later use this to show that
several polytopes have polynomial-size pivot height, for all pivot rules. In
contrast, we show that many well-known combinatorial polytopes have
exponentially-long monotone paths. Surprisingly, for some famous pivot rules,
e.g., greatest improvement and steepest edge, these same polytopes have
polynomial-size simplex paths. Comment: 24 pages, 8 figures
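As a small illustration of a monotone path, consider the standard 0/1 cube with a linear objective, the simplest zonotope. A coordinate flip improves the objective only when it moves that coordinate to its better value, so no coordinate is flipped twice and every monotone path has length at most the dimension. The sketch below checks this by exhaustive search; it is an illustrative assumption-laden example (standard cube, strict improvement, hypothetical function name), not the paper's construction.

```python
from functools import lru_cache
from itertools import product


def longest_monotone_path(c):
    """Length of the longest strictly improving edge path on the 0/1 cube.

    The cube is oriented by the linear objective sum(c[j] * x[j]).
    """
    d = len(c)

    @lru_cache(maxsize=None)
    def longest_from(vertex):
        best = 0
        for j in range(d):
            # Objective change when flipping coordinate j of this vertex.
            step = c[j] * (1 - 2 * vertex[j])
            if step > 0:
                nxt = vertex[:j] + (1 - vertex[j],) + vertex[j + 1:]
                best = max(best, 1 + longest_from(nxt))
        return best

    return max(longest_from(v) for v in product((0, 1), repeat=d))


if __name__ == "__main__":
    print(longest_monotone_path((1, -2, 3)))
    print(longest_monotone_path((1, 0, 2)))
```

The search is exponential in the dimension and is only meant for very small cubes.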
Universal Online Learning with Bounded Loss: Reduction to Binary Classification
We study universal consistency of non-i.i.d. processes in the context of
online learning. A stochastic process is said to admit universal consistency if
there exists a learner that achieves vanishing average loss for any measurable
response function on this process. When the loss function is unbounded,
Blanchard et al. showed that the only processes admitting strong universal
consistency are those taking a finite number of values almost surely. However,
when the loss function is bounded, the class of processes admitting strong
universal consistency is much richer and its characterization could be
dependent on the response setting (Hanneke). In this paper, we show that this
class of processes is independent of the response setting, thereby closing an
open question (Hanneke, Open Problem 3). Specifically, we show that the class
of processes that admit universal online learning is the same for binary
classification as for multiclass classification with a countable number of
classes. Consequently, any output setting with bounded loss can be reduced to
binary classification. Our reduction is constructive and practical. Indeed, we
show that the nearest neighbor algorithm is transported by our construction.
For binary classification on a process admitting strong universal learning, we
prove that nearest neighbor successfully learns at least all finite unions of
intervals.
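The online protocol behind the last claim can be sketched as follows: at each round the learner predicts the label of the closest previously seen point, then observes the true label. This is a minimal illustration on the real line with a target that is a finite union of intervals; the function name, the default prediction of 0 on the first round, and the example target are assumptions and do not reproduce the paper's reduction.

```python
def nearest_neighbor_online(xs, target):
    """Run 1-nearest-neighbor online on the sequence xs; return mistake count.

    target maps a point to its binary label (0 or 1).
    """
    history = []  # previously seen (point, label) pairs
    mistakes = 0
    for x in xs:
        if history:
            _, pred = min(history, key=lambda pair: abs(pair[0] - x))
        else:
            pred = 0  # arbitrary default before any data is seen (assumption)
        y = target(x)
        mistakes += int(pred != y)
        history.append((x, y))
    return mistakes


def in_union_of_intervals(x):
    """Hypothetical target: indicator of [0.2, 0.4] union [0.7, 0.9]."""
    return int(0.2 <= x <= 0.4 or 0.7 <= x <= 0.9)


if __name__ == "__main__":
    print(nearest_neighbor_online([0.3, 0.31, 0.8, 0.1], in_union_of_intervals))
```

On this short sequence the learner errs on the first round (no data yet) and on the last point, whose nearest neighbor lies inside an interval while the point itself does not.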
Evaluation of Cellular Responses for the Diagnosis of Allergic Bronchopulmonary Mycosis: A Preliminary Study in Cystic Fibrosis Patients
Background: Allergic bronchopulmonary mycosis (ABPM) is an underestimated allergic disease due to fungi. Most reported cases are caused by Aspergillus fumigatus (Af) and are referred to as allergic bronchopulmonary aspergillosis (ABPA). The main risk factor of ABPA is a history of lung disease, such as cystic fibrosis, asthma, or chronic obstructive pulmonary disease. The main diagnostic criteria for ABPA rely on the evaluation of humoral IgE and IgG responses to Af extracts, although these cannot discriminate Af sensitization and ABPA. Moreover, fungi other than Af have been incriminated. Flow cytometric evaluation of functional responses of basophils and lymphocytes in the context of allergic diseases is gaining momentum. Objectives: We hypothesized that the detection of functional responses through basophil and lymphocyte activation tests might be useful for ABPM diagnosis. We present here the results of a pilot study comparing the performance of these cellular assays vs. usual diagnostic criteria in a cystic fibrosis (CF) cohort. Methods: Ex vivo basophil activation test (BAT) is a diagnostic tool highlighting an immediate hypersensitivity mechanism against an allergen, e.g., through CD63 upregulation as an indirect measure of degranulation. Lymphocyte stimulation test (LST) relies on the upregulation of activation markers, such as CD69, after incubation with allergen(s), to explain delayed hypersensitivity. These assays were performed with Af, Penicillium, and Alternaria extracts in 29 adult CF patients. Results: BAT responses of ABPA patients were higher than those of sensitized or control CF patients. The highest LST result was for a woman who developed ABPA 3 months after the tests, despite the absence of specific IgG and IgE to Af at the time of the initial investigation. Conclusion: We conclude that basophil and lymphocyte activation tests could enhance the diagnosis of allergic mycosis, compared to usual humoral markers. Further studies with larger cohorts and addressing both mold extracts and mold relevant molecules are needed in order to confirm and extend the application of this personalized medicine approach.